Isolate First, Then Share: a New OS Architecture for Datacenter Computing
This paper presents the "isolate first, then share" OS model, in which processor
cores, memory, and devices are divided among disparate OS instances, and proposes
a new abstraction, subOS, to encapsulate an OS instance that can be created,
destroyed, and resized on the fly. The intuition is that this avoids sharing
kernel state between applications, which in turn reduces the performance loss
caused by contention. We decompose the OS into a supervisor and several subOSes
running at the same privilege level: a subOS directly manages physical resources,
while the supervisor can create, destroy, and resize a subOS on the fly. The
supervisor and subOSes share little state, but fast inter-subOS communication
mechanisms are provided on demand.
We present the first implementation, RainForest, which supports unmodified
Linux binaries. Our comprehensive evaluation shows that RainForest outperforms
Linux with four different kernels, LXC, and Xen in terms of worst-case and
average performance most of the time when running a large number of benchmarks.
The source code will be available soon.
Comment: 14 pages, 13 figures, 5 tables
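The supervisor/subOS resource split described in the abstract can be illustrated with a toy model. This is a minimal sketch under assumed names: the `Supervisor` and `SubOS` classes and their `create`/`resize`/`destroy` methods are invented for illustration, not RainForest's actual API.

```python
class SubOS:
    """An OS instance that directly manages a disjoint slice of physical resources."""
    def __init__(self, cores, mem_gb):
        self.cores = set(cores)
        self.mem_gb = mem_gb

class Supervisor:
    """Creates, destroys, and resizes subOS instances on the fly.

    The supervisor only tracks which physical resources are free, so it
    shares almost no state with running subOSes: each subOS owns its
    cores and memory outright, avoiding contended kernel state.
    """
    def __init__(self, total_cores, total_mem_gb):
        self.free_cores = set(range(total_cores))
        self.free_mem_gb = total_mem_gb
        self.instances = {}

    def create(self, name, n_cores, mem_gb):
        if n_cores > len(self.free_cores) or mem_gb > self.free_mem_gb:
            raise RuntimeError("insufficient free resources")
        cores = {self.free_cores.pop() for _ in range(n_cores)}
        self.free_mem_gb -= mem_gb
        self.instances[name] = SubOS(cores, mem_gb)
        return self.instances[name]

    def resize(self, name, extra_cores=0, extra_mem_gb=0):
        # Grow a running subOS by moving resources out of the free pool.
        sub = self.instances[name]
        if extra_cores > len(self.free_cores) or extra_mem_gb > self.free_mem_gb:
            raise RuntimeError("insufficient free resources")
        for _ in range(extra_cores):
            sub.cores.add(self.free_cores.pop())
        self.free_mem_gb -= extra_mem_gb
        sub.mem_gb += extra_mem_gb

    def destroy(self, name):
        # Return the subOS's resources to the free pool.
        sub = self.instances.pop(name)
        self.free_cores |= sub.cores
        self.free_mem_gb += sub.mem_gb
```

Because resources are partitioned rather than shared, one subOS's load cannot perturb another's fast path; only create/resize/destroy go through the supervisor.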
AIBench Training: Balanced Industry-Standard AI Training Benchmarking
Early-stage evaluations of a new AI architecture or system need affordable AI
benchmarks, while relying on only a few AI component benchmarks in later stages
may lead to misleading conclusions. This paper proposes a balanced benchmarking
methodology. Performing an exhaustive survey of Internet service AI domains, we
identify and implement seventeen representative AI tasks with state-of-the-art
models to guarantee the diversity and representativeness of the benchmarks.
Meanwhile, we keep the benchmark subset to a minimum for affordability. Together
with seventeen industry partners, we contribute by far the most comprehensive AI
training benchmark suite.
The evaluations show: (1) AIBench Training outperforms MLPerf Training in
terms of the diversity and representativeness of model complexity,
computational cost, convergence rate, computation and memory access patterns,
and hotspot functions; (2) compared with the full AIBench benchmarks, the
subset shortens the benchmarking cost by 54% while maintaining the primary
workload characteristics; (3) the performance ranking shows that a
single-purpose AI accelerator like the TPU, with the optimized TensorFlow
framework, performs better than GPUs, while losing the latter's general
support for a variety of AI models.
The AIBench Training specifications, source code, testbed, and performance
numbers are publicly available from the web site
http://www.benchcouncil.org/AIBench/index.html
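Claim (2), that a small subset can preserve the diversity of the full suite, can be illustrated with a subset-selection sketch. The abstract does not describe the selection algorithm, so the greedy farthest-point heuristic, the feature vectors, and all names below are invented for illustration.

```python
import math

def euclid(a, b):
    """Euclidean distance between two workload-characteristic vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pick_subset(features, k):
    """Greedily pick k workloads that spread out over the characteristic
    space (e.g. computational cost, memory access intensity), so the
    subset keeps the diversity of the full suite at a fraction of the
    benchmarking cost."""
    names = list(features)
    chosen = [names[0]]
    while len(chosen) < k:
        # Add the workload farthest from everything already chosen.
        best = max((n for n in names if n not in chosen),
                   key=lambda n: min(euclid(features[n], features[c])
                                     for c in chosen))
        chosen.append(best)
    return chosen

# Made-up characteristic vectors for four hypothetical workloads.
features = {"a": (0.0, 0.0), "b": (1.0, 1.0),
            "c": (0.1, 0.0), "d": (0.0, 0.9)}
```

Here `pick_subset(features, 2)` keeps the two most dissimilar workloads and drops near-duplicates like "c", which mirrors the intuition of trimming a suite while retaining its primary workload characteristics.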
AIBench: An Industry Standard Internet Service AI Benchmark Suite
Today's Internet services are undergoing fundamental changes, shifting to
an intelligent computing era in which AI is widely employed to augment services.
In this context, many innovative AI algorithms, systems, and architectures are
proposed, raising the importance of benchmarking and evaluating them.
However, modern Internet services adopt a microservice-based architecture and
consist of diverse modules. The diversity of these modules, the complexity of
execution paths, the massive scale and complex hierarchy of datacenter
infrastructure, and the confidentiality of data sets and workloads pose great
challenges to benchmarking. In this paper, we present the first
industry-standard Internet service AI benchmark suite---AIBench with seventeen
industry partners, including several top Internet service providers. AIBench
provides a highly extensible, configurable, and flexible benchmark framework
that contains loosely coupled modules. From the three most important Internet
service domains (search engine, social network, and e-commerce), we identify
sixteen prominent AI problem domains, such as learning to rank, each of which
forms an AI component benchmark; this is by far the most comprehensive AI
benchmarking effort. On the basis of the AIBench framework, abstracting
real-world data sets and workloads from one of the top e-commerce providers, we
design and implement the first end-to-end Internet service AI benchmark, which
contains the primary modules on the critical paths of an industry-scale
application and scales to deployment on clusters of different sizes. The
specifications, source code, and performance numbers are publicly available
from the benchmark council web site http://www.benchcouncil.org/AIBench/index.html.
Comment: 24 pages
AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite
Domain-specific software and hardware co-design is promising, as it is much
easier to achieve efficiency for fewer tasks. Agile domain-specific
benchmarking speeds up the process, as it provides not only relevant design
inputs but also relevant metrics and tools. Unfortunately, modern workloads
like big data, AI, and Internet services dwarf traditional ones in terms of
code size, deployment scale, and execution path, and hence raise serious
benchmarking challenges.
This paper proposes an agile domain-specific benchmarking methodology.
Together with seventeen industry partners, we identify ten important end-to-end
application scenarios, from which sixteen representative AI tasks are
distilled as AI component benchmarks. We propose permutations of
essential AI and non-AI component benchmarks as end-to-end benchmarks. An
end-to-end benchmark is a distillation of the essential attributes of an
industry-scale application. We design and implement a highly extensible,
configurable, and flexible benchmark framework, on the basis of which, we
propose the guideline for building end-to-end benchmarks, and present the first
end-to-end Internet service AI benchmark.
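The idea of permuting AI and non-AI component benchmarks into an end-to-end benchmark can be sketched as a pipeline of stages. The stage names and logic below are invented examples of an e-commerce critical path, not AIBench's actual modules.

```python
def make_end_to_end(stages):
    """Chain component benchmarks into one end-to-end workload:
    each named stage consumes the previous stage's output, so the
    composite exercises the whole critical path, not isolated kernels."""
    def run(request):
        result = request
        for name, stage in stages:
            result = stage(result)
        return result
    return run

# Hypothetical critical path: a non-AI query parser followed by AI
# components (recommendation, learning to rank). All values are toy data.
pipeline = make_end_to_end([
    ("parse",     lambda q: {"query": q, "candidates": ["p1", "p2", "p3"]}),
    ("recommend", lambda r: {**r, "candidates": r["candidates"][:2]}),
    ("rank",      lambda r: sorted(r["candidates"], reverse=True)),
])
```

Different permutations of the stage list yield different end-to-end benchmarks from the same pool of component benchmarks, which is the distillation the methodology describes.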
The preliminary evaluation shows the value of our benchmark suite---AIBench---compared
with MLPerf and TailBench for hardware and software designers,
micro-architectural researchers, and code developers. The specifications,
source code, testbed, and results are publicly available from the web site
\url{http://www.benchcouncil.org/AIBench/index.html}.
Comment: 25 pages, 7 figures. arXiv admin note: substantial text overlap with
arXiv:1908.0899